Smart Paper Notes

BARTSmiles: Generative Masked Language Models for Molecular Representations

Gayane Chilingaryan , Hovhannes Tamoyan , Ani Tevosyan , Nelly Babayan , Lusine Khondkaryan , Karen Hambardzumyan , Zaven Navoyan , Hrant Khachatrian , Armen Aghajanyan

Category: Machine Learning

2022-11-29

We discover a robust self-supervised strategy tailored towards molecular representations for generative masked language models through a series of tailored, in-depth ablations. Using this pre-training strategy, we train BARTSmiles, a BART-like model with an order of magnitude more compute than previous self-supervised molecular representations. In-depth evaluations show that BARTSmiles consistently outperforms other self-supervised representations across classification, regression, and generation tasks setting a new state-of-the-art on 11 tasks. We then quantitatively show that when applied to the molecular domain, the BART objective learns representations that implicitly encode our downstream tasks of interest. For example, by selecting seven neurons from a frozen BARTSmiles, we can obtain a model having performance within two percentage points of the full fine-tuned model on task Clintox. Lastly, we show that standard attribution interpretability methods, when applied to BARTSmiles, highlight certain substructures that chemists use to explain specific properties of molecules. The code and the pretrained model are publicly available.

Translated by Google Translate

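This paper presents an efficient patch-based computational module, the Entropy-based Patch Encoder (EPE) module, for resource-constrained semantic segmentation. The EPE module consists of three lightweight fully convolutional encoders, each extracting features from image patches with a different amount of entropy. Patches with high entropy are processed by the encoder with the largest number of parameters, patches with moderate entropy by the encoder with a moderate number of parameters, and patches with low entropy by the smallest encoder. The intuition behind the module is that high-entropy patches contain more information and therefore need an encoder with more parameters, whereas low-entropy patches can be handled by a small encoder. Processing part of the patches with the smaller encoders can therefore significantly reduce the module's computational cost. Experiments show that EPE improves the performance of existing real-time semantic segmentation models with only a slight increase in computational cost. Specifically, EPE raises the mIoU of DFANet A by 0.9% with only a 1.2% increase in the number of parameters, and the mIoU of EDANet by 1% with a 10% increase in model parameters.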
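
The routing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the histogram-based entropy measure, the function names `patch_entropy` and `route_patch`, and the two thresholds are all assumptions made for illustration, and the labels stand in for the three fully convolutional encoders.

```python
import math
from collections import Counter


def patch_entropy(patch):
    """Shannon entropy (in bits) of the intensity histogram of a 2-D patch.

    Assumption: entropy is measured on raw pixel values; the paper may use
    a different definition.
    """
    pixels = [value for row in patch for value in row]
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def route_patch(patch, low_thr=1.0, high_thr=2.0):
    """Choose an encoder size for a patch.

    The thresholds are hypothetical; they only illustrate that
    low-entropy patches go to the smallest encoder and high-entropy
    patches to the largest one.
    """
    h = patch_entropy(patch)
    if h < low_thr:
        return "small"
    if h < high_thr:
        return "medium"
    return "large"


flat = [[7, 7], [7, 7]]      # a single intensity: no information
two_tone = [[0, 0], [9, 9]]  # two equally frequent intensities
busy = [[1, 2], [3, 4]]      # four distinct intensities

for name, patch in [("flat", flat), ("two_tone", two_tone), ("busy", busy)]:
    print(name, route_patch(patch))
```

In a real model, each label would select one of the three encoders, so most of the compute is spent on the patches that carry the most information.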

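Collecting labeled data for many important tasks in cheminformatics is time-consuming and requires expensive experiments. In recent years, machine learning has been used to learn rich representations of molecules from large-scale unlabeled molecular datasets and to transfer that knowledge to more challenging tasks with limited data. Variational autoencoders are one of the tools that have been proposed to perform this transfer for both chemical property prediction and molecular generation tasks. In this work, we propose a simple method to improve the chemical property prediction performance of machine learning models by incorporating additional information on correlated molecular descriptors into the representations learned by variational autoencoders. We verify the method on three property prediction tasks. We explore the impact of the number of incorporated descriptors, the correlation between the descriptors and the target properties, the sizes of the datasets, and so on. Finally, we show the relation between the performance of property prediction models and the distance, in representation space, between the property prediction dataset and the larger unlabeled dataset.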
